Development and validation of multi-omic prognostic signature of anoikis-related genes in liver hepatocellular carcinoma

Liver hepatocellular carcinoma (LIHC) is characterized by high morbidity, rapid progression and early metastasis. Although many efforts have been made to improve the prognosis of LIHC, the situation is still dismal. Inability to initiate anoikis process is closely associated with cancer proliferation and metastasis, affecting patients’ prognosis. In this study, a corresponding gene signature was constructed to comprehensively assess the prognostic value of anoikis-related genes (ARGs) in LIHC. Using TCGA-LIHC dataset, the mRNA levels of the differentially expressed ARGs in LIHC and normal tissues were compared by Student t test. And prognostic ARGs were identified through Cox regression analysis. Prognostic signature was established and then externally verified by ICGC-LIRI-JP dataset and GES14520 dataset via LASSO Cox regression model. Potential functions and mechanisms of ARGs in LIHC were evaluated by functional enrichment analyses. And the immune infiltration status in prognostic signature was analyzed by ESTIMATE algorithm and ssGSEA algorithm. Furthermore, ARGs expression in LIHC tissues was validated via qRT-PCR and IHC staining from the HPA website. A total of 97 differentially expressed ARGs were detected in LIHC tissues. Functional enrichment analysis revealed these genes were mainly involved in MAP kinase activity, apoptotic signaling pathway, anoikis and PI3K-Akt signaling pathway. Afterward, the prognostic signature consisting of BSG, ETV4, EZH2, NQO1, PLK1, PBK, and SPP1 had a moderate to high predictive accuracy and served as an independent prognostic indicator for LIHC. The prognostic signature was also applicable to patients with distinct clinical parameters in subgroup survival analysis. And it could reflect the specific immune microenvironment in LIHC, which indicated high-risk group tended to profit from ICI treatment. Moreover, qRT-PCR and IHC staining showed increasing expression of BSG, ETV4, EZH2, NQO1, PLK1, PBK and SPP1in LIHC tissues, which were consistent to the results from TCGA database. The current study developed a novel prognostic signature comprising of 7 ARGs, which could stratify the risk and effectively predict the prognosis of LIHC patients. Furthermore, it also offered a potential indicator for immunotherapy of LIHC.


Introduction
As the most common subtype of primary liver cancer, liver hepatocellular carcinoma (LIHC) is the second deadliest cancer globally. [1]In recent years, global incidence and death rate of LIHC presents a gradual upward trend.Scientists predict transplantation, have been widely used in clinical application, dismal 5-year survival rate of <20% and early diagnosis rate of <30% still exist in some developing countries. [3,4]High treatment expenditure, severe drug adverse reaction and lack of advanced diagnosis and treatment technology also reduce patients' treatment participation. [5]In all, survival status of LIHC patients is unsatisfactory.In addition, with the in-depth study of tumor heterogeneity, traditional prognostic assessment according to TNM stage alone is being challenging.Thus, developing a prognostic signature based on molecular profiles of LIHC is promising.
Anoikis is a specific form of apoptotic event that can be initiated when cells lose their interaction with the adjacent extracellular matrix (ECM). [6,7]It is indispensable to maintain the stability of body tissues, and its main function is to prevent abnormal cell growth or cell adhesion to abnormal ECM.For example, when cells are detached from ECM, the loss of integrin-mediated survival signals will activate Bcl-2-modifying factor, thus activating the process of anoikis. [8]evertheless, cancer cells can develop anoikis resistance occasionally through circumventing death signaling pathway.These cells with anoikis resistance may survival while spreading from the primary site to distant organs and participate in the emergence of metastases. [6,9]In this process, anoikis-resistant cancer cells also show high VEGFA expression and promote angiogenesis. [10]At present, dysregulation of cell apoptosis is considered as a key factor in tumorigenesis and tumor development.Thus, exploring expression status and prognostic value of anoikis-related genes (ARGs) in LIHC as well as their correlations with tumor microenvironment is of great significance.
Previous studies have reported that ARGs participated in the process of tumor metastasis and progression and was associated with patients' prognosis in various human tumor types.For example, Takagi et al suggested high KLF5 expression contributed to anoikis resistance and poor prognosis in colorectal cancer patients.In addition, through promoting cell proliferation and cancer stem cell-like properties, KLF5 also contributed to hepatic metastases of colorectal cancer. [11]Shimokawa et al reported NQO1 activation was associated with LIHC metastasis and poor prognosis through generating anoikis resistance in anchorage-independent culture. [12]Kim et al suggested that CEACAM6 overexpression was correlated with unfavorable prognosis in lung adenocarcinoma and inhibited anoikis via the activation of Src-FAK signaling pathway. [13]In pancreatic cancer, Jiang et al suggested upregulated EDIL3 expression was related to shorter survival time, and inhibition of EDIL3 disrupted anoikis resistance and anchorage-independent tumor growth. [14]he abovementioned articles have proved the prognostic value of ARGs in corresponding cancer types.As for LIHC, some studies have started to focus on the role of ARGs in LIHC, but few studies discussed the relationship between integrated ARGs and survival outcomes.In this study, we analyzed the expression status, mutation status and prognostic performance of ARGs in LIHC.We built up and verified an ARGs-related prognostic signature using integrated expression data and survival data.Besides, the correlation between tumor immune microenvironment and the signature was also evaluated.The current study may offer powerful evidences on risk stratification, prognosis evaluation and immunotherapy of LIHC.

Expression analysis and mutation analysis
Using "edgeR" package in R, the differential expression analysis of all the genes in TCGA-LIHC dataset was conducted with the following threshold: false discovery rate (FDR) < 0.05 and |fold change (FC)| > 1. Afterward, differentially expressed ARGs (DEARGs) were acquired by interacting with the 404 ARGs.The process was visualized by Venn plot.The expression data was standardized to fragments per kilobase million values and was compared by Student t test.Using "reshape2" and "limma" packages, the box plots showed the transcriptional levels of DEARGs in LIHC tissues and normal liver tissues.CNV and SNV status of target genes in LIHC as well as the waterfall plot were analyzed and visualized by "maftools" package.The corresponding variation locations of these genes on chromosomes were visualized by "RCircos" package.

Protein-protein interaction network construction and functional enrichment analysis
The abovementioned expression analysis identified the DEARGs in LIHC.Next, based on STRING (https://cn.string-db.org/)website, [15] the associations among these genes were expressed as a protein-protein interation (PPI) network.To investigate the potential molecular mechanisms and biological functions of DEARGs, gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed based on "clusterProfiler" package.GO analysis included 3 dimensions: biological process (BP), cellular component (CC) and molecular function (MF).The threshold of functional enrichment analysis was set as FDR < 0.05 and q-value < 0.2.

Prognostic analysis and prognostic signature construction
Setting the median expression level as cutoff, the prognostic performance of DEARGs was also analyzed by "survival" package in R and was visualized as a forest graph.After prognostic-related DEARGs were screened out, LASSO cox regression analysis in "glmnet" package was used to establish the prognostic signature.Riskscore = n i=1 (Expi × Coei) .Expi was the expression value of each prognostic-related DEARG in LIHC, and the Coei was the corresponding multivariable Cox regression coefficient.Selecting the median risk score as cutoff, patients were assigned to high-risk and low-risk group.Overall survival curves of the 2 groups were compared by Kaplan-Meier analysis in "survival" package.Predictive accuracy of the prognostic signature was evaluated by Time ROC analysis in "timeROC" package.ICGC-LIRI-JP dataset and GSE14520 dataset were also obtained from International Cancer Genome Consortium (ICGC, https://dcc.icgc.org/) and Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/)database to externally validate predictive power, respectively.Moreover, univariate and multivariate Cox regression analysis were conducted to explore the independent prognosis indicators.Afterward, subgroup survival analyses were utilized to explore whether the prognostic signature was also applicable to patients with different clinical parameters.In addition, the risk score differences in multiple subgroups was also evaluated.

Immune analysis of the prognostic signature
Previous studies suggested immune microenvironment was of great importance in the occurrence and progression of tumor.To evaluate the overall immune microenvironment in the 2 groups, the ESTIMATE algorithm was used to assess the StromalScore, EstimateScore, and ImmuneScore. [16]Next, the infiltrating levels of 24 distinct immunocyte subsets in high-risk and lowrisk group were also investigated by ssGSEA algorithm. [17]urthermore, the association between risk score and some common immune checkpoints (CD274, CTLA4, HAVCR2, TIGIT, PDCD1, LAG3, PDCD1LG2, and SIGLEC15) expression was also analyzed.

Validation of expression status of the risk score-related ARGs
Approved by the Ethics Committee of the Second Affiliated Hospital of Nanchang University, tumorous tissues and adjacent normal tissues from 40 LIHC patients without any preoperative therapy were collected for the present study.All these patients were informed and provided written consent.Quantitative real-time polymerase chain reaction (qRT-PCR) was performed as previously described. [18]The 2 −ΔΔCt method was used to calculate the expression of each gene relative to the housekeeping gene.Primers used in this study were presented in Supplementary table 2, http://links.lww.com/MD/K769.The transcriptional level of risk score-related ARGs in tumorous tissues and adjacent normal tissues were compared by Student t test.Furthermore, we also downloaded the immunohistochemical (IHC) staining data from The Human Protein Atlas (HPA, https:// www.proteinatlas.org/) to verify their protein levels. [19]

Functional enrichment analysis of DEARGs
The above expression analysis indicated that 97 ARGs were differentially expressed in LIHC tissues.To predict their potential biological functions and molecular mechanisms, functional enrichment analyses were conducted.In Figure 2A, GO analysis, including BP, CC and MF analysis, illustrated these DEARGs were mainly associated with regulation of MAP kinase activity, regulation of protein serine/threonine kinase activity, regulation of apoptotic signaling pathway and anoikis.KEGG pathway analysis revealed these DEARGs were primarily correlated with the process of PI3K-Akt signaling pathway, ECM-receptor interaction, endocrine resistance, progesterone-mediated oocyte maturation and EGFR tyrosine kinase inhibitor resistance (Fig. 2B).
The results of functional enrichment analysis further indicated their correlations with anoikis.

Establishment and validation of an anoikis-related prognostic signature
For better predicting LIHC patients' prognosis, LASSO Cox regression method was utilized to analyze the 33 prognostic-related DEARGs in the abovementioned survival analyses, and finally identified 7 genes (BSG, ETV4, EZH2, PLK1, PBK, NQO1, and SPP1) for prognostic risk model construction.

Risk score correlated with clinical characteristics
The influence of clinical characteristics on risk score was also analyzed.According to patients' gender, age, T stage, tumor grade and clinical stage, patients were assigned to distinct subgroups to compare their risk score.In the subgroup of age (≥60 years or not) and gender, no statistical significance was detected between the 2 group (Fig. 7A and 7B, all P value>.05).However, patients with T3-T4 stage had higher risk score than those with T1-T2 stage (Fig. 7C, P value = 1.6e-04), and LIHC patients with tumor grade 3-4 showed higher risk score than those with tumor grade 1-2 (Fig. 7D, P value = 4.1e-08).In addition, compared with patients with clinical stage 1-2, patients with clinical stage 3-4 also revealed higher risk score (Fig. 7E, P value = 1.8e-04).

Risk score correlated with immune cell infiltration
Tumor immune microenvironment was considered as one of the factors that influenced tumor biological behavior.For exploring the relationship between risk score and immune cell infiltration, we first utilized the ESTIMATE algorithm to calculate StromalScore, ImmuneScore and EstimateScore in high-risk and low-risk group.The result revealed higher ImmuneScore was detected in high-risk group (Fig. 8B, P value = 7.4e-03).
As for StromalScore (Fig. 8A, P value>.05)and EstimateScore (Fig. 8C, P value>.05),no statistical significance was observed between high-risk and low-risk group.Afterward, according to ssGSEA algorithm, the infiltrating levels of 24 kinds of immunocytes in the 2 groups were compared.As shown in Figure 8D, the infiltrating levels of activated dendritic cell, immature dendritic cell, macrophages, NK CD56bright cells, T helper cells, TFH cells, Th1 cells and Th2 cells were upregulated in high-risk group, while the infiltrating levels of CD8 + T cells,

Risk score correlated with the expression of immune checkpoint
Currently, checkpoint inhibitor-based immunotherapy played a crucial role in cancer immunotherapy. [20]Thus, we investigated immune checkpoint response status in the 2 groups.As illustrated in Figure 9A, the expression levels of CTLA4, HAVCR2, LAG3, PDCD1 and TIGIT were significantly upregulated in high-risk group.Nevertheless, there was no significant difference in CD274, SIGLEC15 and PDCD1LG2 expression between the 2 group.Next, using spearman correlation analysis, the current study visualized the association between risk score and CTLA4 (cor = 0.421, P value<.001),HAVCR2 (cor = 0.473, P value<.001),LAG3 (cor = 0.187, P value<.001),PDCD1 (cor = 0.361, P value<.001)and TIGIT (cor = 0.320, P value<.001)expression, which indicated the expression levels of these 5 immune checkpoints were positively correlated with risk score (Fig. 9B-9F).These results revealed that high-risk group was likely to generate immunosuppressive microenvironment.

Discussion
LIHC is the malignant tumor with high rates of recurrence and metastasis, which causes a disproportionate health burden to patients. [21]The dismal overall 5-year survival rate of <20% is partly due to metastasis of LIHC. [3]As a specific form of cell apoptosis, previous studies suggested anoikis resistance of detached cancer cells is a key process of metastasis and is a hallmark of metastasis. [22]Some of the ARGs were also identified as prognostic indicators and regulated occurrence and progression of corresponding cancer cells. [11- 14]With the development of sequencing technique and establishment of genome databases, large-scale and comprehensive analyses on ARGs in LIHC become feasible.However, comprehensive bioinformatic analyses focusing on the expression status and prognostic performance of ARGs in LIHC as well as their associations with tumor immune microenvironment is lacking.We firstly explored the mRNA levels of ARGs in LIHC tissues and normal liver tissues.In the expression analysis, 97 genes (67 upregulated genes and 30 downregulated genes) were differentially expressed.These 97 DEARGs were next enrolled in functional enrichment analysis.The result revealed these genes were mainly involved in regulation of MAP kinase activity, regulation of protein serine/threonine kinase activity, regulation of apoptotic signaling pathway, anoikis, PI3K-Akt signaling pathway, ECM-receptor interaction and EGFR tyrosine kinase inhibitor resistance.In previous publications, activation of PI3K/Akt signaling pathway was considered as the most common form to result in anoikis resistance in cancer cells.On the one hand, through regulating phosphorylation of pro-apoptotic proteins, such as BAD and pro-CASP9, activation of Akt could directly inhibit their functions. [23,24]On the other hand, sustained Akt activation was often accompanied by elevated N-cadherin expression.N-cadherin induced anoikis resistance by recruiting PI3K. [25]Cancer cells could adapt to the loss of anchorage and generate anoikis resistance by activating EGFR family members [26] or MAPK/ERK signaling pathway-mediated suppression of BIM. [26,27]In addition, anoikis was an apoptotic process indeed, so it was also modulated by both intrinstic and extrinsic apoptotic pathways. [28]These abovementioned perspectives provided powerful support for the results of GO and KEGG pathway analysis.The 97 DEARGs were enrolled to perform survival analysis, and 33 prognostic-related DEARGs were selected out.Risk score in the prognostic signature consisted of BSG, ETV4, EZH2, NQO1, PLK1, PBK and SPP1.Patients with high risk score in TCGA-LIHC cohort were associated with unfavorable prognosis.ICGC-LIRI-JP cohort and GSE14520 cohort were downloaded for external validating the prognostic model, and showed similar results with TCGA-LIHC cohort.According to the results of Time ROC analysis, the prediction model showed moderate to high predictive accuracy on the whole.Another bioinformatic analysis building up ARGs-related prognostic signature for patients with endometrial carcinoma also indicated an unfavorable prognosis in high-risk group. [29]Nevertheless, large-scale and multi-center studies are required to verify these results.
TME in LIHC is generally thought to be immunosuppressive, which allows cancer cells to evade anti-cancer immunity and promote tumor growth and metastasis. [30]For instance, cancer cells evaded immune surveillance by inhibiting functions of DC. [31] FOXP3 + Tregs inhibited the production and activation of anti-tumor effector cells, thus generating immunosuppressive functions in TME. [32]In our study, the results indicated that the upregulated infiltrating levels of activated dendritic cell, immature dendritic cell, macrophages, NK CD56bright cells, T helper cells, TFH cells, Th1 cells and Th2 cells in high-risk group, while the decreased infiltrating levels of CD8 + T cells, cytotoxic cells, DC, pDC, Th17 cells and Treg cells in high-risk group.High-risk group also showed higher ImmuneScore.The result indicated there were differences in TME between the 2 groups.Currently, immune checkpoint inhibitor (ICI) therapy is considered as a solution to treat unresectable cancer and is one of adjuvant therapies after surgery. [33]The current study showed CTLA4, HAVCR2, LAG3, PDCD1, and TIGIT expression were significantly elevated in high-risk group and were positively correlated with risk score.According to the theory of "hot" tumor and "cold" tumor, [34] high-risk group might be the "hot" tumor and tend to profit from ICI treatment.
Using qRT-PCR and IHC staining data from HPA website, higher expression levels of the 7 risk score-related ARGs in LIHC tissues, including BSG, ETV4, EZH2, NQO1, PLK1, PBK and SPP1, were detected.In previous publications, EZH2, a cancer-associated protein with histone-methyltransferase activity, was overexpressed and was related to poor prognosis in LIHC. [35]As for NQO1, [36] PLK1 [37] and BSG, [38] the prognostic performance and expression status were similar with EZH2.Wang et al suggested SPP1 was upregulated in LIHC tissues and multiple LIHC cell lines and enhanced LIHC cell growth through targeting miR-181c. [39]Yang et al reported overexpression of PBK promotes metastasis of LIHC by regulating ETV4-uPAR signaling pathway and PBK expression was increased in LIHC tissues both on mRNA and protein level. [40]n addition, the overexpression of ETV4 was also detected in LIHC cell lines and tissues, and in vivo and in vitro experiments revealed ETV4 promoted migration and invasion of LIHC cells. [41]To some extent, the above studies showed similar expression status of the 7 ARGs in LIHC with the current study.Some limitations also should be acknowledged.Firstly, most of the analyses were performed on mRNA level, some results might be inapplicable to studies on protein level.Secondly, the underlying mechanisms of ARGs-related signature to influence prognosis, immune cell infiltration and immune checkpoints should be further studied by experiments.Thirdly, many factors, such as race, age and gender, also influenced patients' genetic background, more in-depth studies were required to study this field.

Conclusions
In all, our study established an anoikis-related gene signature for LIHC patients.It was conducive to prognosis prediction of LIHC patients and reflected immune cell infiltration and

Figure 2 .
Figure 2. Functional enrichment analysis of the 97 DEARGs.(A) The bar plot of GO analysis, including BP, CC and MF analysis.(B) The bar plot of KEGG pathway analysis.DEARG = differentially expressed anoikis-related gene, GO = gene ontology, KEGG = Kyoto Encyclopedia of Genes and Genomes.

Figure 3 .
Figure 3. Expression status and mutation overviews of the 33 prognostic-related ARGs in LIHC.(A) The mRNA levels of prognostic-related ARGs in LIHC and normal liver tissues.(B-C) The mutation frequency and classification of prognostic-related ARGs in LIHC.(D) The CNV variation frequency of prognostic-related ARGs in LIHC.The height of the column represented the alteration frequency.(E) The location of CNV alteration of prognostic-related ARGs on chromosomes in LIHC.Note: *P < .05,**P < .01,***P < .001.ARG = anoikis-related gene.CNV = copy number variation, LIHC = liver hepatocellular carcinoma.

Figure 4 .
Figure 4. Prognostic model constructed by the 33 prognostic-related ARGs.The partial likelihood deviance (A) and coefficients (B) of the prognostic model.Risk score distribution, patients' survival status and expression status of the 7 ARGs associated with risk score in TCGA-LIHC dataset (C), in ICGC-LIRI-JP dataset (D) and GSE14520 dataset (E).The OS plots and corresponding ROC curves of high-risk group and low-risk group in TCGA-LIHC dataset (F, I), in ICGC-LIRI-JP dataset (G, J) and GSE14520 dataset (H, K).ARG = anoikis-related gene, ICGC = International Cancer Genome Consortium, LIHC = liver hepatocellular carcinoma, TCGA = The Cancer Genome Atlas.

Figure 6 .
Figure 6.The OS curves of high-risk group and low-risk group based on the subgroups of age (A, B), gender (C, D), T stage (E, F), N stage (G) and M stage (H) clinical stage (I, J) and tumor grade (K, L).

Figure 7 .
Figure 7. Risk score related to clinical features in LIHC.The difference of risk score in subgroups of age (A), gender (B), T stage (C), tumor grade (D) and clinical stage (E).LIHC = liver hepatocellular carcinoma.

Figure 8 .
Figure 8.The association between risk score and immune cell infiltration.No statistical difference was found in StromalScore (A) and ESTIMATEScore (C) between high-risk and low-risk group.While high-risk group had a higher ImmuneScore (B) than low-risk group.(D) The infiltrating levels of 24 kinds of immunecytos in the 2 groups.Note: *P < .05,**P < .01,***P < .001.